Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't stop for errors dumping invalid names or long names #375

Closed
wants to merge 1 commit into from

Conversation

2600box
Copy link

@2600box 2600box commented Oct 11, 2023

Made some minor changes so that extraction continues despite errors from invalid characters and long names.

See issues:
#373
#318
#213

For example:

❯ ./zim-tools/build/src/zimdump dump --dir=./dump3 archive.zim
Error writing file to errors dir. ./dump3/_exceptions/H%2fplay.google.com%2flog?format=json&hasfast=true&authuser=0&__wb_method=POST&[[1,null,null,null,null,null,null,null,null,null,[null,null,null,null,"en",null,"17",null,null,[1,0,0,0,0]]],1654,[["1696854400954",null,[],null,null,null,null,"[[[\"%2fclient_streamz%2fpo%2fw%2fel\",null,[\"en\",\"rk\"],[[[[\"c\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,1]],[[[\"c\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,0]],[[[\"q\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,0.8000030517578125]],[[[\"S\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,3.5999984741210938]],[[[\"b\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,199.0999984741211]],[[[\"i\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,1.5]],[[[\"r\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,3440.6000061035156]],[[[\"C\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,2.4000015258789062]],[[[\"x\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,0]],[[[\"m\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,3.100006103515625]]],null,[]],[\"%2fclient_streamz%2fpo%2fw%2frl\",null,[\"mn\",\"ac\",\"sc\",\"rk\"],[[[[\"c\"],[null,1],[null,0],[\"O43z0dpjhgX20SCx4KAo\"]],[null,1887.900001525879]],[[[\"g\"],[null,1],[null,0],[\"O43z0dpjhgX20SCx4KAo\"]],[null,1331.400001525879]]],null,[]],[\"%2fclient_streamz%2fpo%2fw%2fcsc\",null,[\"cs\",\"rk\"],[[[[null,3],[\"O43z0dpjhgX20SCx4KAo\"]],[1]]],null,[]]]]",null,null,null,null,null,null,0,[null,[],null,"[[],[],[],[]]"],null,null,null,[],1,null,null,null,null,null,[]]],"1696854400955",[]]
Error writing file to errors dir. ./dump3/_exceptions/A%2fplay.google.com%2flog?format=json&hasfast=true&authuser=0&__wb_method=POST&[[1,null,null,null,null,null,null,null,null,null,[null,null,null,null,"en",null,"17",null,null,[1,0,0,0,0]]],1654,[["1696854400954",null,[],null,null,null,null,"[[[\"%2fclient_streamz%2fpo%2fw%2fel\",null,[\"en\",\"rk\"],[[[[\"c\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,1]],[[[\"c\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,0]],[[[\"q\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,0.8000030517578125]],[[[\"S\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,3.5999984741210938]],[[[\"b\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,199.0999984741211]],[[[\"i\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,1.5]],[[[\"r\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,3440.6000061035156]],[[[\"C\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,2.4000015258789062]],[[[\"x\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,0]],[[[\"m\"],[\"O43z0dpjhgX20SCx4KAo\"]],[null,3.100006103515625]]],null,[]],[\"%2fclient_streamz%2fpo%2fw%2frl\",null,[\"mn\",\"ac\",\"sc\",\"rk\"],[[[[\"c\"],[null,1],[null,0],[\"O43z0dpjhgX20SCx4KAo\"]],[null,1887.900001525879]],[[[\"g\"],[null,1],[null,0],[\"O43z0dpjhgX20SCx4KAo\"]],[null,1331.400001525879]]],null,[]],[\"%2fclient_streamz%2fpo%2fw%2fcsc\",null,[\"cs\",\"rk\"],[[[[null,3],[\"O43z0dpjhgX20SCx4KAo\"]],[1]]],null,[]]]]",null,null,null,null,null,null,0,[null,[],null,"[[],[],[],[]]"],null,null,null,[],1,null,null,null,null,null,[]]],"1696854400955",[]]

Made some minor changes so that extraction continues despite errors
Copy link
Collaborator

@mgautierfr mgautierfr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your PR !

I think this PR is a bit overkill, only the first remove of throw seems needed.

Comment on lines +241 to +242
//throw std::runtime_error(
std::string("Error writing file to errors dir. ") + (base + ERRORSDIR + url);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commenting the throw std::runtime_error( will just not throw the "error string".

But the std::string is still created for nothing.
Better remove the two lines.

Comment on lines +324 to +325
//throw std::runtime_error(
std::string("Error creating symlink from ") + full_path + " to " + redirectPath;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this one.
Here we are failing to create a symlink. It can fails for a lot of reason. And even if we fail about a too long names, we must go through the error handling (and here, potentially print a warning only instead of failing if name is too long).

@@ -395,7 +395,7 @@ int subcmdDump(ZimDumper &app, std::map<std::string, docopt::value> &args)
std::string directory = args["--dir"].asString();

if (directory.empty()) {
throw std::runtime_error("Directory cannot be empty.");
//throw std::runtime_error("Directory cannot be empty.");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. This test is used to not overwrite a existing content in the output directory.
We don't want to remove it.

@kelson42
Copy link
Contributor

We can not just skip these files. They need somehow to be written on the disk and the trace should be written in the exception log.

@2600box
Copy link
Author

2600box commented Oct 19, 2023

We can not just skip these files. They need somehow to be written on the disk and the trace should be written in the exception log.

Definitely would appreciate that. Unfortunately I think it is beyond my coding ability to implement. Hopefully this inspires someone to take a look? Thanks!

@kelson42 kelson42 closed this Mar 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants